Abstract



This paper studies biased riffle shuffles, first defined by Diaconis, Fill, and Pitman. These shuffles generalize the well-studied Gilbert-Shannon-Reeds shuffle and convolve nicely. An upper bound is given for the time for these shuffles to converge to the uniform distribution; this bound is of the same order as lower bounds of Lalley. A careful version of a bijection of Gessel leads to a generating function for cycle structure after one of these shuffles and gives new results about descents in random permutations. Results are also obtained about the inversion and descent structure of a permutation after one of these shuffles.

1 Introduction and Background

The most widely used method of shuffling cards is riffle shuffling. Roughly speaking, one cuts the deck of cards into two piles of approximately equal size and then riffles the two piles together. A precise mathematical model of riffle shuffles is the Gilbert-Shannon-Reeds (or GSR) shuffle, found independently by Gilbert and Reeds. This model says to first cut the $n$ card deck into two packs of size $m$ and $n-m$ with probability $\binom{n}{m}/2^n$. Then drop cards from these packs one at a time, such that if pack 1 has $A_1$ cards and pack 2 has $A_2$ cards, the next card is dropped from pack 1 with probability $\frac{A_1}{A_1+A_2}$ and from pack 2 with probability $\frac{A_2}{A_1+A_2}$.

Before defining biased shuffles, let us recall the notion of the descent set of a permutation. An element $\pi \in S_n$ is said to have a descent at position $i$ if $\pi(i) > \pi(i+1)$. By convention we say that all $\pi \in S_n$ have a descent at position $n$. The descent set of $\pi$ is the set of positions at which $\pi$ has a descent.

This paper analyzes a notion of biased riffle shuffles which generalizes the GSR shuffle (the GSR shuffle will correspond to the case $a = 2$, $p_1 = p_2 = \frac{1}{2}$). These biased shuffles seem to have first been considered on pages 153-4 of Diaconis, Fill, and Pitman. We now give four descriptions of these biased riffle shuffles. These descriptions generalize the descriptions of the GSR shuffle in Bayer and Diaconis [1]. It is elementary to prove that these descriptions are equivalent.



Descriptions of Biased a-shuffles 

1. Cut the $n$ card deck into $a$ piles by picking pile sizes according to the mult$(a;p)$ law, where $p = (p_1, \cdots, p_a)$. In other words, choose $b_1, \cdots, b_a$ with probability:

$$\binom{n}{b_1, \cdots, b_a} \prod_{i=1}^{a} p_i^{b_i}$$

Then choose uniformly one of the $\binom{n}{b_1, \cdots, b_a}$ ways of interleaving these packets, leaving the cards in each packet in their original relative order. (In the language of descents, choose uniformly one of the $\binom{n}{b_1, \cdots, b_a}$ permutations whose inverse has descent set contained in $\{b_1, b_1+b_2, \cdots, b_1 + \cdots + b_a = n\}$.)



2. As in Description 1, cut the $n$ card deck into $a$ piles according to the mult$(a;p)$ law. Now drop cards from the $a$ packets one at a time, according to the rule that if the $i$th packet has $A_i$ cards, then the next card is dropped from the $i$th packet with probability $\frac{A_i}{A_1 + \cdots + A_a}$.

3. Drop $n$ points in $[0, 1]$ according to the following procedure. Break the unit interval into $a$ sub-intervals of length $\frac{1}{a}$. Pick the $i$th interval with probability proportional to $p_i$. Then drop uniformly in this interval. Label the points $x_1, \cdots, x_n$ in order of smallest to largest. The map $x \mapsto ax \pmod 1$ reorders these points. The induced measure on $S_n$ is the same as in Descriptions 1 and 2.

4. The inverse of a biased $a$-shuffle has the following description. Start with an ordered deck of $n$ cards face down. Successively and independently, cards are turned face up and dealt into a random pile $i$ with probability proportional to $p_i$. After all the cards have been distributed, the piles are assembled from left to right and the deck is turned face down.
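Description 2 translates directly into a sampler. The following Python sketch is only an illustration: the function name `biased_riffle` and the representation of the deck as a list are our own choices, and cards are taken from the front of each packet rather than dropped from the bottom, which we assume gives the same distribution because the interleaving is uniform either way.

```python
import random

def biased_riffle(deck, p, rng):
    """Sample one biased a-shuffle of `deck` in the spirit of Description 2."""
    n, a = len(deck), len(p)
    # Multinomial cut: label each card independently with pile i (probability p[i])
    # and keep only the resulting pile sizes.
    labels = rng.choices(range(a), weights=p, k=n)
    sizes = [labels.count(i) for i in range(a)]
    # Cut the deck into consecutive packets of those sizes.
    packets, start = [], 0
    for size in sizes:
        packets.append(deck[start:start + size])
        start += size
    taken = [0] * a          # how many cards each packet has already released
    shuffled = []
    for _ in range(n):
        # The next card comes from packet i with probability A_i / (A_1 + ... + A_a),
        # where A_i is the number of cards still in packet i.
        remaining = [sizes[i] - taken[i] for i in range(a)]
        i = rng.choices(range(a), weights=remaining, k=1)[0]
        shuffled.append(packets[i][taken[i]])
        taken[i] += 1
    return shuffled
```

For instance, `biased_riffle(list(range(52)), [0.4, 0.6], random.Random(1))` returns a rearrangement of the 52 cards in which each of the two packets keeps its internal order.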

We denote the measure on $S_n$ defined by Descriptions 1-4 by $P_{n,a;p}$. For example, one can check that the measure $P_{3,2;p_1,p_2}$ with $p_2 = 1 - p_1$ assigns to permutations in cycle form the following masses:

    Permutation    Mass
    (1)(2)(3)      $p_1^3 + p_1^2 p_2 + p_1 p_2^2 + p_2^3$
    (1)(23)        $p_1^2 p_2$
    (2)(13)        $0$
    (3)(12)        $p_1 p_2^2$
    (123)          $p_1 p_2^2$
    (132)          $p_1^2 p_2$
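The six masses above can be checked by brute force. The sketch below enumerates all pile assignments of the inverse description for $n = 3$, $a = 2$ with exact rational arithmetic. It assumes a convention that the text does not spell out: the shuffled permutation is taken to be the standard permutation of the pile-assignment word (card $i$ moves to the rank of its letter under a stable sort). The helper names are hypothetical, and permutations are stored 0-based in one-line form.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def standard_permutation(word):
    """One-line form (0-based) of st(word): the stable rank of each position."""
    order = sorted(range(len(word)), key=lambda i: word[i])  # sorted() is stable
    perm = [0] * len(word)
    for rank, i in enumerate(order):
        perm[i] = rank
    return tuple(perm)

def shuffle_law(n, p):
    """Exact law of st(w) when the n letters of w are i.i.d. with distribution p."""
    law = defaultdict(Fraction)
    for word in product(range(len(p)), repeat=n):
        weight = Fraction(1)
        for letter in word:
            weight *= p[letter]
        law[standard_permutation(word)] += weight
    return dict(law)

# Example values p_1 = 1/3, p_2 = 2/3 (chosen arbitrarily for illustration).
p1, p2 = Fraction(1, 3), Fraction(2, 3)
law = shuffle_law(3, [p1, p2])
```

With these values the transposition (2)(13), stored as `(2, 1, 0)`, never appears as a key of `law`, in agreement with its mass of zero.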



If $p = (p_1, \cdots, p_a)$ and $p' = (p'_1, \cdots, p'_b)$, define the product:

$$p \otimes p' = (p_1 p'_1, \cdots, p_1 p'_b, \cdots, p_a p'_1, \cdots, p_a p'_b)$$

The following fact, which shows that biased riffle shuffles convolve well, is stated without proof in Diaconis, Fill, and Pitman.

Proposition 1 The convolution of $P_{n,a;p}$ and $P_{n,b;p'}$ is $P_{n,ab;p \otimes p'}$.

Proof: This follows from the inverse description of card shuffling. Lexicographically combining the pile assignments from an inverse $a$-shuffle and an inverse $b$-shuffle gives independent pile assignments, distributed according to $p \otimes p'$, for an inverse $ab$-shuffle. □
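Proposition 1 can be tested exactly on a small deck. The sketch below rests on two conventions of our own, neither fixed by the text: a shuffle is identified with the standard permutation of its i.i.d. pile-assignment word, and the convolution of laws of $X$ and $Y$ is the law of the composition $i \mapsto X(Y(i))$. With the opposite composition order the product would be $p' \otimes p$ instead. All function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def standard_permutation(word):
    """One-line form (0-based) of st(word): the stable rank of each position."""
    order = sorted(range(len(word)), key=lambda i: word[i])  # stable sort
    perm = [0] * len(word)
    for rank, i in enumerate(order):
        perm[i] = rank
    return tuple(perm)

def shuffle_law(n, p):
    """Exact law of st(w) for a word w of n i.i.d. letters distributed as p."""
    law = defaultdict(Fraction)
    for word in product(range(len(p)), repeat=n):
        weight = Fraction(1)
        for letter in word:
            weight *= p[letter]
        law[standard_permutation(word)] += weight
    return dict(law)

def tensor(p, q):
    """The product p (x) q = (p_1 q_1, ..., p_1 q_b, ..., p_a q_1, ..., p_a q_b)."""
    return [x * y for x in p for y in q]

def compose_laws(law_x, law_y):
    """Law of the composition i -> X(Y(i)) for independent X ~ law_x, Y ~ law_y."""
    out = defaultdict(Fraction)
    for x, px in law_x.items():
        for y, py in law_y.items():
            out[tuple(x[j] for j in y)] += px * py
    return dict(out)
```

Comparing `compose_laws(shuffle_law(3, p), shuffle_law(3, q))` with `shuffle_law(3, tensor(p, q))` for rational `p` and `q` gives identical dictionaries under these conventions.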

Proposition 1 is the starting point for this paper. Little seems to be known about biased riffle shuffles. The Gilbert-Shannon-Reeds shuffle (the case of equal $p_i$), however, has been fairly well studied (e.g. Bayer and Diaconis [1] or Diaconis, McGrath, and Pitman).

2 Bounding the Time to Uniform 

This section uses the concept of a strong uniform time to upper bound the time for biased riffle shuffles to get close to the uniform distribution. The bounds obtained are of the same order as lower bounds due to Lalley.



Recall that the total variation distance between two probability distributions $P_1$ and $P_2$ on a set $X$ is defined as:

$$\|P_1 - P_2\| = \frac{1}{2} \sum_{x \in X} |P_1(x) - P_2(x)|$$

Let $P^{*k}$ denote the $k$-fold convolution of $P$. Let $U$ be the uniform distribution on $S_n$.

Theorem 1

$$\|P^{*k}_{n,a;p} - U\| \le \binom{n}{2} \left[p_1^2 + \cdots + p_a^2\right]^k$$

Proof: For each $k$, let $A^k$ be a random $n \times k$ matrix formed by letting each entry equal $i$ with probability $p_i$. Note that the random matrix $A^k$ corresponds to a random permutation under the measure $P^{*k}_{n,a;p}$. To see this, recall Description 4 of biased riffle shuffles (the inverse description). A single inverse $a$-shuffle corresponds to a column of $A^k$ by letting the $i$th entry in the column equal the pile into which card $i$ is placed.

Let $T$ be the first time that the rows of $A^k$ are distinct. It is not hard to see that $T$ is a strong uniform time for $P_{n,a;p}$ in the sense of Sections 4B-4D of Diaconis. Namely, the permutation associated to the matrix $A^T$ is uniform. This is because, as in Proposition 1, the inverse of the $k$-fold convolution of $a$-shuffles may be viewed as inverse sorting into $a^k$ piles, and at time $T$ each pile has at most 1 card. Symmetry implies that these cards are in uniform random order. It is proved on page 76 of Diaconis that:

$$\|P^{*k}_{n,a;p} - U\| \le Prob(T > k)$$

Let $V_{ij}$ be the event that rows $i$ and $j$ of $A^k$ are the same. The probability that $V_{ij}$ occurs is $[p_1^2 + \cdots + p_a^2]^k$. The theorem follows since:



$$Prob(T > k) = Prob\Big(\bigcup_{1 \le i < j \le n} V_{ij}\Big) \le \sum_{1 \le i < j \le n} Prob(V_{ij}) = \binom{n}{2} \left[p_1^2 + \cdots + p_a^2\right]^k$$

□
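For a very small deck the total variation distance can be computed exactly and set against the bound $\binom{n}{2}[p_1^2 + \cdots + p_a^2]^k$. The sketch below does this under two assumptions of ours: the $k$-fold convolution is realized, via Proposition 1, as a single shuffle whose parameters are the $a^k$ products of the $p_i$, and a shuffle is identified with the standard permutation of its pile-assignment word. Function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import permutations, product
from math import comb, factorial

def standard_permutation(word):
    """One-line form (0-based) of st(word): the stable rank of each position."""
    order = sorted(range(len(word)), key=lambda i: word[i])  # stable sort
    perm = [0] * len(word)
    for rank, i in enumerate(order):
        perm[i] = rank
    return tuple(perm)

def shuffle_law(n, p):
    """Exact law of st(w) for a word w of n i.i.d. letters distributed as p."""
    law = defaultdict(Fraction)
    for word in product(range(len(p)), repeat=n):
        weight = Fraction(1)
        for letter in word:
            weight *= p[letter]
        law[standard_permutation(word)] += weight
    return dict(law)

def k_fold_parameters(p, k):
    """Parameters of the a^k-shuffle equivalent to k successive shuffles."""
    params = [Fraction(1)]
    for _ in range(k):
        params = [x * y for x in params for y in p]
    return params

def tv_to_uniform(law, n):
    """Total variation distance (1/2) sum |P - U| over all of S_n."""
    u = Fraction(1, factorial(n))
    return sum(abs(law.get(perm, Fraction(0)) - u)
               for perm in permutations(range(n))) / 2

def theorem1_bound(n, p, k):
    """Right-hand side binom(n,2) * (p_1^2 + ... + p_a^2)^k."""
    return comb(n, 2) * sum(x * x for x in p) ** k
```

For $n = 3$ and $p = (1/3, 2/3)$ one can loop over $k = 1, 2, 3, 4$ and compare `tv_to_uniform(shuffle_law(3, k_fold_parameters(p, k)), 3)` with `theorem1_bound(3, p, k)`; the bound is vacuous for small $k$ and only becomes informative once $[p_1^2 + p_2^2]^k$ drops below $1/\binom{n}{2}$.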
Remarks

1. Theorem 1 shows that $k = 2\log_{\frac{1}{p_1^2 + \cdots + p_a^2}} n$ steps suffice to get close to the uniform distribution (in the case $a = 2$, $p_1 = p_2 = \frac{1}{2}$ this is $2\log_2 n$).

2. Lalley proved that there exists an open neighborhood of $p_1 = \frac{1}{2}$ such that for all $p_1$ in this neighborhood, a $P_{n,2;p_1,p_2}$ shuffle takes at least

$$\frac{3 + \theta}{4} \log_{\frac{1}{p_1^2 + p_2^2}} n$$

steps to get close to the uniform distribution. Here $\theta = \theta_{p_1}$ is the unique real number such that

$$p_1^{\theta} + p_2^{\theta} = (p_1^2 + p_2^2)^2$$

Note that when $p_1 = p_2 = \frac{1}{2}$ this bound is $\frac{3}{2}\log_2 n$, which is of the same order as the $2\log_2 n$ bound of Theorem 1, and agrees exactly with the more refined analysis of Bayer and Diaconis [1] for the GSR shuffles.

3 Gessel's Bijection and Cycle Structure 

This section begins by describing a bijection of Gessel. This requires some preliminary notation and concepts. Recall that a permutation $\pi \in S_n$ is said to have a descent at position $i$ if $\pi(i) > \pi(i+1)$. We adopt the convention that all $\pi \in S_n$ have a descent at position $n$. Define a necklace on an alphabet to be a sequence of cyclically arranged letters of the alphabet. A necklace is said to be primitive if it is not equal to any of its non-trivial cyclic shifts. For example, the necklace $(a\ a\ b\ b)$ is primitive, but the necklace $(a\ b\ a\ b)$ is not.

Given a word $w$ of length $n$ on an ordered alphabet, the 2-row form of the standard permutation $st(w) \in S_n$ is defined as follows. Write $w$ under $1 \cdots n$ and then write under each letter of $w$ its lexicographic order in $w$, where if two letters of $w$ are the same, the one to the left is considered smaller. For example (page 195 of Gessel and Reutenauer):

            1  2  3  4  5  6  7  8  9 10 11 12
    w     = b  b  a  a  b  c  c  c  b  c  b  b
    st(w) = 3  4  1  2  5  9 10 11  6 12  7  8

For a finite ordered alphabet $A$, Gessel and Reutenauer give a bijection $U$ from the set of length $n$ words $w$ on $A$ onto the set of finite multisets of necklaces of total size $n$, such that the cycle structure of $st(w)$ is equal to the cycle structure of $U(w)$. To define $U(w)$, one replaces each number in the necklace of $st(w)$ by the letter above it. In the example, the necklace of $st(w)$ is (1 3), (2 4), (5), (6 9), (7 11 8 12 10). This gives the following multiset of necklaces on $A$:

(a b)(a b)(b)(b c)(b c b c c)

Theorem 2, one of the main results of this section, will follow from this bijection.
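The worked example can be reproduced mechanically. The following Python sketch computes $st(w)$ and the resulting multiset of necklaces for the word $w = bbaabcccbcbb$. The function names are ours, and two conventions are assumptions: each necklace is reported by its lexicographically smallest rotation, and cycles are traversed as $i \to st(w)(i)$ (for this particular example the opposite orientation yields the same necklaces).

```python
def standard_permutation(word):
    """1-based one-line form of st(word); ties are broken left to right."""
    order = sorted(range(len(word)), key=lambda i: word[i])  # sorted() is stable
    perm = [0] * len(word)
    for rank, i in enumerate(order, start=1):
        perm[i] = rank
    return perm

def necklaces(word):
    """Multiset of necklaces obtained by reading letters along the cycles of st(word)."""
    st = standard_permutation(word)
    seen = [False] * len(word)
    result = []
    for start in range(len(word)):
        if seen[start]:
            continue
        letters, i = [], start
        while not seen[i]:
            seen[i] = True
            letters.append(word[i])   # the letter written above position i + 1
            i = st[i] - 1             # move along the cycle of st(word)
        cyc = "".join(letters)
        # Canonical representative: the smallest rotation of the cyclic word.
        result.append(min(cyc[r:] + cyc[:r] for r in range(len(cyc))))
    return sorted(result)

example = "bbaabcccbcbb"
```

Here `standard_permutation(example)` gives the bottom row 3 4 1 2 5 9 10 11 6 12 7 8, and `necklaces(example)` gives `['ab', 'ab', 'b', 'bc', 'bcbcc']`.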

Theorem 2 Fix $r_1, \cdots, r_a \ge 0$ such that $\sum_{i=1}^{a} r_i = n$. The bijection $U$ defines by restriction a cycle-structure preserving bijection from elements of $S_n$ with descent set contained in $\{r_1, r_1 + r_2, \cdots, r_1 + \cdots + r_a = n\}$ to multisets of primitive necklaces on the alphabet $\{1, \cdots, a\}$ formed from a total of $r_i$ $i$'s.

Proof: Restrict $U$ to the set of words with $r_i$ $i$'s. It is clear that an element $\pi$ of $S_n$ can arise as the standard permutation of at most one word with $r_i$ $i$'s. Also, the $\pi$ which arise are precisely those $\pi$ such that the descent set of $\pi^{-1}$ is contained in $\{r_1, r_1 + r_2, \cdots, r_1 + \cdots + r_a = n\}$. Since $\pi$ and $\pi^{-1}$ have the same cycle structure, this proves the theorem. □

Corollary 1 will translate Theorem 2 into the language of generating functions. This uses some further notation. Define the quantity $M(r_1, \cdots, r_a)$, where $n = r_1 + \cdots + r_a$, as:

$$M(r_1, \cdots, r_a) = \frac{1}{n} \sum_{d | n, r_1, \cdots, r_a} \mu(d) \frac{(n/d)!}{(r_1/d)! \cdots (r_a/d)!}$$

One easily proves by Moebius inversion (e.g. page 172 of Hall) that $M(r_1, \cdots, r_a)$ is the number of primitive circular words from an alphabet $\{1, \cdots, a\}$ in which the letter $i$ appears $r_i$ times.
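The quantity $M(r_1, \cdots, r_a)$ is easy to evaluate numerically. The sketch below is a direct transcription of the Moebius sum, using the observation that $d$ divides $n$ and every $r_i$ exactly when $d$ divides $\gcd(r_1, \cdots, r_a)$. The helper `mobius` is a hypothetical hand-rolled implementation (trial division), adequate only for small arguments, and `M` assumes $n \ge 1$.

```python
from functools import reduce
from math import factorial, gcd

def mobius(d):
    """Moebius function mu(d) by trial division."""
    result, q = 1, 2
    while q * q <= d:
        if d % q == 0:
            d //= q
            if d % q == 0:      # a squared prime factor forces mu = 0
                return 0
            result = -result
        q += 1
    if d > 1:                   # one remaining prime factor
        result = -result
    return result

def M(r):
    """Number of primitive circular words in which letter i appears r[i] times."""
    n = sum(r)
    g = reduce(gcd, r)          # d | n and d | r_i for all i  <=>  d | g
    total = 0
    for d in range(1, g + 1):
        if g % d == 0:
            term = factorial(n // d)
            for r_i in r:
                term //= factorial(r_i // d)   # multinomial coefficient, exact at each step
            total += mobius(d) * term
    return total // n
```

For example `M((2, 2))` evaluates to 1, matching the earlier remark that $(a\ a\ b\ b)$ is primitive while $(a\ b\ a\ b)$ is not.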

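Recall that we are using the convention that all permutations in $S_n$ have a descent at position $n$. For $b_i, n_i \ge 0$, let $b = (b_1, \cdots, b_a)$ and $n = (n_1, n_2, \cdots)$. Let $A_{b,n}$ be the number of permutations on $b_1 + \cdots + b_a$ letters with descent set contained in $\{b_1, b_1 + b_2, \cdots, b_1 + \cdots + b_a\}$ and $n_i$ $i$-cycles.

Corollary 1 For all $a \ge 1$,

$$\sum_{b,n} A_{b,n} \prod_{i=1}^{a} z_i^{b_i} \prod_{j} x_j^{n_j} = \prod_{i=1}^{\infty} \prod_{\substack{r_1, \cdots, r_a \ge 0 \\ r_1 + \cdots + r_a = i}} \left( \frac{1}{1 - z_1^{r_1} \cdots z_a^{r_a} x_i} \right)^{M(r_1, \cdots, r_a)}$$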

Proof: The coefficient of $\prod_{i=1}^{a} z_i^{b_i} \prod_j x_j^{n_j}$ on the left hand side is equal to $A_{b,n}$, the number of permutations on $b_1 + \cdots + b_a$ letters with descent set contained in $\{b_1, b_1 + b_2, \cdots, b_1 + \cdots + b_a\}$ and $n_j$ $j$-cycles. Theorem 2 says that this is the number of multisets of primitive necklaces on the alphabet $\{1, \cdots, a\}$ with $b_i$ $i$'s and $n_j$ necklaces of length $j$. The corollary now follows from the interpretation of $M(r_1, \cdots, r_a)$ as the number of primitive circular words of length $r_1 + \cdots + r_a$ from an alphabet $\{1, \cdots, a\}$ in which the letter $i$ appears $r_i$ times. □

Corollary 1 will be used to study the cycle structure of a permutation under the measure $P_{n,a;p}$. Let $E_{n,a;p}$ denote expectation with respect to the measure $P_{n,a;p}$, and let $N_i$ denote the random variable on $S_n$ such that $N_i(\pi)$ is the number of $i$-cycles of $\pi$. The case of Theorem 3 with all $p_i = \frac{1}{a}$ is known from Diaconis, McGrath, and Pitman.

Theorem 3

$$\sum_{n=0}^{\infty} u^n E_{n,a;p}\Big(\prod_{i} x_i^{N_i}\Big) = \prod_{i=1}^{\infty} \prod_{\substack{r_1, \cdots, r_a \ge 0 \\ r_1 + \cdots + r_a = i}} \left( \frac{1}{1 - p_1^{r_1} \cdots p_a^{r_a} u^i x_i} \right)^{M(r_1, \cdots, r_a)}$$

Proof: Corollary 1 (with $z_i = u p_i$) and elementary manipulations imply that:

$$\prod_{i=1}^{\infty} \prod_{\substack{r_1, \cdots, r_a \ge 0 \\ r_1 + \cdots + r_a = i}} \left( \frac{1}{1 - p_1^{r_1} \cdots p_a^{r_a} u^i x_i} \right)^{M(r_1, \cdots, r_a)} = \sum_{n=0}^{\infty} u^n \sum_{b_1 + \cdots + b_a = n} \prod_{i=1}^{a} p_i^{b_i} \sum_{n_1, n_2, \cdots} A_{b,n} \prod_j x_j^{n_j}$$

$$= \sum_{n=0}^{\infty} u^n \sum_{b_1 + \cdots + b_a = n} \left[ \binom{n}{b_1, \cdots, b_a} \prod_{i=1}^{a} p_i^{b_i} \right] \sum_{n_1, n_2, \cdots} \left[ \frac{A_{b,n}}{\binom{n}{b_1, \cdots, b_a}} \right] \prod_j x_j^{n_j}$$

We give a probabilistic interpretation to this last expression. The first term in square brackets is the chance that a deck cut according to the mult$(a;p)$ distribution is cut into packets of size $b_1, \cdots, b_a$. To interpret the second term in square brackets, use the fact from page 17 of Stanley [12] that the total number of permutations on $n = b_1 + \cdots + b_a$ letters with descent set contained in $\{b_1, b_1 + b_2, \cdots, b_1 + \cdots + b_a\}$ is the multinomial coefficient $\binom{n}{b_1, \cdots, b_a}$. Thus the second term is equal to the chance that choosing uniformly among permutations on $n$ letters whose inverse has descent set contained in $\{b_1, b_1 + b_2, \cdots, b_1 + \cdots + b_a\}$ gives a permutation with $n_i$ $i$-cycles. This proves the theorem. □

As an example of an application of Theorem 3, one obtains an expression for the expected number of fixed points after a $k$-fold convolution of the measure $P_{n,a;p}$.

Corollary 2 The expected number of fixed points of a permutation under the $k$-fold convolution of $P_{n,a;p}$ is:

$$\sum_{j=1}^{n} \left( p_1^j + \cdots + p_a^j \right)^k$$

Proof: Recall from the introductory section that the $k$-fold convolution of an $a$-shuffle with parameters $(p_1, \cdots, p_a)$ is equivalent to an $a^k$-shuffle with parameters equal to the $a^k$ possible products $p_{s_1} \cdots p_{s_k}$ where each $s_i \in \{1, \cdots, a\}$ and repetition is allowed. Since the $j$th powers of these products sum to $(p_1^j + \cdots + p_a^j)^k$, it suffices to prove the corollary in the case $k = 1$.

In the generating function of Theorem 3 one wants to set $x_1 = x$, $x_i = 1$ for $i \ge 2$, then differentiate with respect to $x$, set $x = 1$, and finally take the coefficient of $u^n$.

Setting $x_1 = x$, $x_i = 1$ for $i \ge 2$ in the generating function of Theorem 3 gives:

$$\frac{1}{1-u} \cdot \frac{1 - p_1 u}{1 - p_1 x u} \cdots \frac{1 - p_a u}{1 - p_a x u}$$

because the $x_1 = x$ term contributes $\prod_{i=1}^{a} \frac{1}{1 - p_i x u}$ and the $x_i = 1$ for $i \ge 2$ term contributes $\frac{1}{1-u} \prod_{i=1}^{a} (1 - p_i u)$. The corollary now follows by easy algebra. □
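The differentiation step gives $\frac{1}{1-u}\sum_{i} \frac{p_i u}{1 - p_i u}$, whose coefficient of $u^n$ is $\sum_{j=1}^{n}(p_1^j + \cdots + p_a^j)$. The Python sketch below compares this closed form, in its $k$-fold version $\sum_{j=1}^{n}(p_1^j + \cdots + p_a^j)^k$, with an exact enumeration on small decks. As assumptions of ours, a shuffle is identified with the standard permutation of its i.i.d. pile-assignment word and $k$ shuffles are realized as one $a^k$-shuffle; all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def standard_permutation(word):
    """One-line form (0-based) of st(word): the stable rank of each position."""
    order = sorted(range(len(word)), key=lambda i: word[i])  # stable sort
    perm = [0] * len(word)
    for rank, i in enumerate(order):
        perm[i] = rank
    return tuple(perm)

def k_fold_parameters(p, k):
    """Parameters of the a^k-shuffle equivalent to k successive shuffles."""
    params = [Fraction(1)]
    for _ in range(k):
        params = [x * y for x in params for y in p]
    return params

def expected_fixed_points(n, p, k):
    """Exact expected number of fixed points after k shuffles, by enumeration."""
    params = k_fold_parameters(p, k)
    total = Fraction(0)
    for word in product(range(len(params)), repeat=n):
        weight = Fraction(1)
        for letter in word:
            weight *= params[letter]
        perm = standard_permutation(word)
        total += weight * sum(1 for i in range(n) if perm[i] == i)
    return total

def fixed_point_formula(n, p, k):
    """Closed form sum_{j=1}^{n} (p_1^j + ... + p_a^j)^k."""
    return sum(sum(x ** j for x in p) ** k for j in range(1, n + 1))
```

With `p = [Fraction(1, 3), Fraction(2, 3)]`, the two functions return the same rational number for `n = 3, 4` and `k = 1, 2`.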
Remarks

1. In the case of $p_i = \frac{1}{a}$, Corollary 2 shows that the expected number of fixed points after $k$ $a$-shuffles is:

$$\sum_{j=1}^{n} \frac{1}{a^{(j-1)k}}$$

which is known from Diaconis, McGrath, and Pitman. In fact Holder's inequality gives:

$$\frac{1}{a^{j-1}} \le p_1^j + \cdots + p_a^j$$

so that the expected number of fixed points is smallest for unbiased riffle shuffles.

2. It turns out that for $(p_1^2 + \cdots + p_a^2)^k \ll 1$, the number of fixed points is close to its Poisson(1) limit. In fact fixed points (and more generally other functionals of cycle structure) approach their limit distribution more quickly than $P_{n,a;p}$ approaches its uniform limit.



4 Enumerative Applications of Gessel's Bijection 

This section considers some enumerative applications of Theorem 2. To begin, formulas will be found for the chance that an $n$-cycle in $S_n$ has a given descent set $J$. Recall that all permutations in $S_n$ are considered to have a descent at position $n$. We also use the notation that if $J = \{j_1 < j_2 < \cdots < j_d = n\}$ and $j_0 = 0$, then $C(J)$, the composition of the descent set $J$, is equal to $(j_1 - j_0, \cdots, j_d - j_{d-1})$.



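Stanley [12] gives two formulas for the number of permutations with descent set $J$. These will both turn out to have analogs for the case of $n$-cycles.

Proposition 2 (Page 69 of Stanley [12]) The number of elements of $S_n$ with descent set $J$ is:

$$\sum_{K \subseteq J} (-1)^{|J| - |K|} \binom{n}{C(K)}$$

where $\binom{n}{C(K)}$ denotes the multinomial coefficient whose lower entries are the parts of $C(K)$.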



This carries over to $n$-cycles as follows, where $M(r_1, \cdots, r_a)$ is defined as in Section 3.

Corollary 3 The number of $n$-cycles with descent set $J$ is:

$$\sum_{K \subseteq J} (-1)^{|J| - |K|} M(C(K))$$

Proof: By Moebius inversion on the power set of $\{1, \cdots, n\}$, it suffices to show that the number of $n$-cycles with descent set contained in $K$ is $M(C(K))$. This follows from Theorem 2. □
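The alternating sum over $K \subseteq J$ can be compared with a direct count for small $n$. In the sketch below, descent sets are passed without the conventional descent at $n$ (so $J \subseteq \{1, \cdots, n-1\}$, and $n$ is appended when forming compositions); this bookkeeping choice and all function names are ours.

```python
from functools import reduce
from itertools import combinations, permutations
from math import factorial, gcd

def mobius(d):
    """Moebius function mu(d) by trial division."""
    result, q = 1, 2
    while q * q <= d:
        if d % q == 0:
            d //= q
            if d % q == 0:
                return 0
            result = -result
        q += 1
    return -result if d > 1 else result

def M(r):
    """Number of primitive circular words with letter i appearing r[i] times."""
    n, g = sum(r), reduce(gcd, r)
    total = 0
    for d in range(1, g + 1):
        if g % d == 0:
            term = factorial(n // d)
            for r_i in r:
                term //= factorial(r_i // d)
            total += mobius(d) * term
    return total // n

def composition(K, n):
    """C(K with n appended): gaps between consecutive elements, starting from 0."""
    cuts = sorted(K) + [n]
    return [c - prev for prev, c in zip([0] + cuts[:-1], cuts)]

def n_cycles_formula(J, n):
    """Sum over subsets K of J of (-1)^(|J|-|K|) M(C(K))."""
    J = sorted(J)
    return sum((-1) ** (len(J) - size) * M(composition(K, n))
               for size in range(len(J) + 1)
               for K in combinations(J, size))

def n_cycles_brute_force(J, n):
    """Count n-cycles in S_n whose descents inside {1,...,n-1} are exactly J."""
    count = 0
    for perm in permutations(range(1, n + 1)):
        if {i for i in range(1, n) if perm[i - 1] > perm[i]} != set(J):
            continue
        i, length = perm[0], 1
        while i != 1:               # follow the cycle containing 1
            i, length = perm[i - 1], length + 1
        count += (length == n)
    return count
```

For example, both `n_cycles_formula([2], 4)` and `n_cycles_brute_force([2], 4)` return 1, the single 4-cycle being 2413 in one-line notation.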

There is also a determinantal formula for the number of permutations with descent set $J$. Suppose that the elements of $J$ are $1 \le j_1 < j_2 < \cdots < j_k \le n - 1$. Define $j_0 = 0$ and $j_{k+1} = n$.

Proposition 3 (Page 69 of Stanley [12]) The number of elements of $S_n$ with descent set $J$ is the determinant of a $k + 1$ by $k + 1$ matrix, where $(l, m) \in [0, k] \times [0, k]$:

$$\det \binom{n - j_l}{j_{m+1} - j_l}$$

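This can be generalized to $n$-cycles. Given $J$, a subset of $\{1, \cdots, n - 1\}$, let $J^d$ be the subset of $J$ consisting of all numbers divisible by $d$. If $J^d$ is non-empty, label its elements $1 \le j_1^d < j_2^d < \cdots < j_{|J^d|}^d \le n - 1$. Define $j_0^d = 0$ and $j_{|J^d|+1}^d = n$.

Theorem 4 The number of $n$-cycles with descent set $J$ is:

$$\frac{1}{n} \sum_{d | n} \mu(d) (-1)^{|J| - |J^d|} \det \binom{\frac{n - j_l^d}{d}}{\frac{j_{m+1}^d - j_l^d}{d}}$$

where for each $d$ the determinant is of the matrix indexed by $(l, m) \in [0, |J^d|] \times [0, |J^d|]$.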

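Proof: From Theorem 2 (in the form of Corollary 3), the number of $n$-cycles with descent set $J$ is:

$$\sum_{K \subseteq J} (-1)^{|J| - |K|} M(C(K)) = \sum_{K \subseteq J} (-1)^{|J| - |K|} \frac{1}{n} \sum_{d : K \subseteq J^d} \mu(d) \binom{n/d}{C(K)/d}$$

$$= \frac{1}{n} \sum_{d | n} \mu(d) \sum_{K \subseteq J^d} (-1)^{|J| - |K|} \binom{n/d}{C(K)/d}$$

$$= \frac{1}{n} \sum_{d | n} \mu(d) (-1)^{|J| - |J^d|} \sum_{K \subseteq J^d} (-1)^{|J^d| - |K|} \binom{n/d}{C(K)/d}$$

Proposition 2 shows that $\sum_{K \subseteq J^d} (-1)^{|J^d| - |K|} \binom{n/d}{C(K)/d}$ is the number of permutations on $\frac{n}{d}$ symbols with descent set $\frac{J^d}{d}$. The theorem then follows from Proposition 3. □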

The enumeration of matrices with fixed row and column sums is related to some problems in statistics (see for instance the work of Diaconis and Sturmfels [5]). Proposition 4 relates the theory of such matrices to the theory of descents in involutions.

Proposition 4 The number of involutions in $S_n$ with descent set contained in $K = \{k_1, \cdots, k_r = n\}$ is equal to the number of symmetric $r \times r$ matrices with non-negative integer entries and with $i$th row sum $k_i - k_{i-1}$, where by convention $k_0 = 0$.

Proof: Theorem 2 shows that it suffices to count the number of multisets of primitive necklaces on an alphabet of $k_i - k_{i-1}$ $i$'s, where each necklace has length 1 or 2. Note that a primitive necklace of length 2 consists of a pair of distinct elements. So for $i \ne j$, let $X_{ij}$ be the number of pairs of letter $i$ with letter $j$, and let $X_{ii}$ be the number of singleton $i$'s. The matrix $(X_{ij})$ has all the desired properties. □
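Proposition 4 is easy to test by exhaustive search on small cases. The sketch below counts both sides independently; the function names are hypothetical, and the matrix enumeration is deliberately naive (it tries every upper-triangular filling with entries up to the largest row sum), so it is meant only for very small $r$ and $n$.

```python
from itertools import permutations, product

def involutions_with_descents_in(K, n):
    """Count involutions of S_n whose descents in {1,...,n-1} all lie in K."""
    allowed = set(K)
    count = 0
    for perm in permutations(range(1, n + 1)):
        # perm is an involution when applying it twice returns every point.
        if any(perm[perm[i] - 1] != i + 1 for i in range(n)):
            continue
        if all(i in allowed for i in range(1, n) if perm[i - 1] > perm[i]):
            count += 1
    return count

def symmetric_matrices(row_sums):
    """Count symmetric non-negative integer matrices with the given row sums."""
    r = len(row_sums)
    cells = [(i, j) for i in range(r) for j in range(i, r)]
    bound = max(row_sums)
    count = 0
    for values in product(range(bound + 1), repeat=len(cells)):
        X = [[0] * r for _ in range(r)]
        for (i, j), v in zip(cells, values):
            X[i][j] = X[j][i] = v
        if all(sum(X[i]) == row_sums[i] for i in range(r)):
            count += 1
    return count
```

For $n = 4$ and $K = \{2, 4\}$ the row sums are $(2, 2)$; both counts equal 3, the involutions being 1234, 1324 and 3412 in one-line notation.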



5 Inversion and Descent Structure After a Shuffle 

It is natural to study the inversion and descent structure of a permutation obtained after a biased riffle shuffle. Recall that $\pi$ is said to invert the pair $(i, j)$ with $i < j$ if $\pi(i) > \pi(j)$. The number of inversions of $\pi$ is the number of pairs which $\pi$ inverts and will be denoted $Inv(\pi)$. It is easy to see that $Inv(\pi) = Inv(\pi^{-1})$ and that $Inv(\pi)$ is the length of $\pi$ in terms of the generators $\{(i, i+1) : 1 \le i \le n - 1\}$. Theorem 5 will give a $q$-exponential generating function for $Inv$ after a biased riffle shuffle. This uses the notation:

$$[n]! = \prod_{i=0}^{n-1} (1 + q + \cdots + q^i), \qquad \left[ {n \atop k} \right] = \frac{[n]!}{[k]! [n-k]!}$$

As usual, $E_{n,a;p}$ denotes expectation with respect to the measure $P_{n,a;p}$. As will be explained in the course of the proof, the second equality in Theorem 5 is purely formal in the sense that it only holds if $|q| < 1$, and thus only the first equality should be used for the purpose of computing moments.



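Theorem 5

$$\sum_{n=0}^{\infty} \frac{u^n}{[n]!} E_{n,a;p}\left( q^{Inv} \right) = \prod_{i=1}^{a} \sum_{j=0}^{\infty} \frac{(u p_i)^j}{[j]!} = \prod_{i=1}^{a} \prod_{j=0}^{\infty} \frac{1}{1 - u p_i (1 - q) q^j}$$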



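Proof: The following identity is clear from elementary manipulations and the definition of $q$-multinomial coefficients, $\left[ {n \atop b_1 \cdots b_a} \right] = \frac{[n]!}{[b_1]! \cdots [b_a]!}$:

$$\sum_{n=0}^{\infty} \sum_{\substack{b_i \ge 0 \\ b_1 + \cdots + b_a = n}} p_1^{b_1} \cdots p_a^{b_a} \left[ {n \atop b_1 \cdots b_a} \right] \frac{u^n}{[n]!} = \prod_{i=1}^{a} \sum_{j=0}^{\infty} \frac{(u p_i)^j}{[j]!}$$

The left-hand side can be rewritten as:

$$\sum_{n=0}^{\infty} \frac{u^n}{[n]!} \sum_{\substack{b_i \ge 0 \\ b_1 + \cdots + b_a = n}} \left[ \binom{n}{b_1, \cdots, b_a} p_1^{b_1} \cdots p_a^{b_a} \right] \frac{\left[ {n \atop b_1 \cdots b_a} \right]}{\binom{n}{b_1, \cdots, b_a}}$$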

Since $Inv(\pi)$ is equal to $Inv(\pi^{-1})$, it is sufficient to analyze the number of inversions in the inverse of a permutation chosen from the measure $P_{n,a;p}$. Recalling the first description of biased riffle shuffling in Section 1, note that the term in brackets corresponds to picking the packet sizes $b_1, \cdots, b_a$ according to the mult$(a;p)$ law. From pages 22 and 70 of Stanley [12], it is known that $\left[ {n \atop b_1 \cdots b_a} \right]$ is the sum of $q^{Inv(\pi)}$ over all $\pi$ in $S_n$ with descent set contained in $\{b_1, b_1 + b_2, \cdots, b_1 + \cdots + b_a = n\}$ and that $\binom{n}{b_1, \cdots, b_a}$ is the number of permutations with descent set contained in $\{b_1, b_1 + b_2, \cdots, b_1 + \cdots + b_a = n\}$. These observations prove the first equality of the theorem.

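The second equality follows from a famous identity of Euler, which is true if $|x|, |q| < 1$:

$$\prod_{r=0}^{\infty} \frac{1}{1 - x q^r} = \sum_{j=0}^{\infty} \frac{x^j}{(1 - q) \cdots (1 - q^j)}$$

applied with $x = u p_i (1 - q)$, since $[j]! = \frac{(1 - q) \cdots (1 - q^j)}{(1 - q)^j}$. □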
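Extracting the coefficient of $u^n$ from the first equality suggests the finite identity $E_{n,a;p}(q^{Inv}) = [n]! \sum_{b_1 + \cdots + b_a = n} \prod_i p_i^{b_i}/[b_i]!$, which can be checked exactly for small $n$. The sketch below does so at a rational value of $q$. It uses the fact that the inversions of the standard permutation of a word $w$ are the pairs $i < j$ with $w_i > w_j$, and since $Inv(\pi) = Inv(\pi^{-1})$ the choice between a shuffle and its inverse does not matter here. Function names are ours.

```python
from fractions import Fraction
from itertools import product

def q_factorial(n, q):
    """[n]! = prod_{i=0}^{n-1} (1 + q + ... + q^i)."""
    result = Fraction(1)
    for i in range(n):
        result *= sum(q ** j for j in range(i + 1))
    return result

def expected_q_inv(n, p, q):
    """Exact E(q^Inv) by enumerating i.i.d. pile-assignment words."""
    total = Fraction(0)
    for word in product(range(len(p)), repeat=n):
        weight = Fraction(1)
        for letter in word:
            weight *= p[letter]
        # Inversions of st(word): pairs i < j whose letters are strictly out of order.
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if word[i] > word[j])
        total += weight * q ** inv
    return total

def theorem5_coefficient(n, p, q):
    """[n]! times the coefficient of u^n in prod_i sum_j (u p_i)^j / [j]!."""
    total = Fraction(0)
    for b in product(range(n + 1), repeat=len(p)):
        if sum(b) == n:
            term = Fraction(1)
            for p_i, b_i in zip(p, b):
                term *= p_i ** b_i / q_factorial(b_i, q)
            total += term
    return q_factorial(n, q) * total
```

With `p = [Fraction(1, 3), Fraction(2, 3)]` and `q = Fraction(1, 2)`, the two functions agree for `n = 2, 3, 4`.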



Theorem 5 can be used to compute the expected number of inversions after a $k$-fold convolution of a $P_{n,a;p}$ shuffle. However, we prefer the following direct probabilistic argument.

Theorem 6 The expected number of inversions under the $k$-fold convolution of $P_{n,a;p}$ is:

$$\frac{\binom{n}{2}}{2} \left[ 1 - (p_1^2 + \cdots + p_a^2)^k \right]$$

Proof: For $1 \le i < j \le n$, define a random variable $X_{i,j}$ as follows. In the inverse model of card shuffling, let $X_{i,j} = 1$ if card $i$ goes to a pile to the right of card $j$, and let $X_{i,j} = 0$ otherwise. It is easy to see that if $\pi$ is the permutation obtained after the shuffle, then $\pi(i) > \pi(j)$ exactly when $X_{i,j} = 1$. Thus,

$$Inv = \sum_{1 \le i < j \le n} X_{i,j}$$

It is clear that each $X_{i,j}$ has expected value $\frac{1 - (p_1^2 + \cdots + p_a^2)^k}{2}$ because this is one half the chance that cards $i$ and $j$ fall in different piles. The theorem now follows by linearity of expectation. □
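The linearity argument can be confirmed by exact enumeration on a small deck. The sketch below assumes, as a modeling choice of ours, that $k$ shuffles are realized as a single inverse sort into $a^k$ piles with product parameters, and counts an inversion for each pair $i < j$ in which card $i$ lands in a pile to the right of card $j$. The closed form used for comparison is $\frac{1}{2}\binom{n}{2}[1 - (p_1^2 + \cdots + p_a^2)^k]$; function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb

def k_fold_parameters(p, k):
    """Parameters of the a^k-shuffle equivalent to k successive shuffles."""
    params = [Fraction(1)]
    for _ in range(k):
        params = [x * y for x in params for y in p]
    return params

def expected_inversions(n, p, k):
    """Exact expected number of inversions after k shuffles, by enumeration."""
    params = k_fold_parameters(p, k)
    total = Fraction(0)
    for word in product(range(len(params)), repeat=n):
        weight = Fraction(1)
        for letter in word:
            weight *= params[letter]
        # X_{i,j} = 1 when card i is dealt to a pile to the right of card j.
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if word[i] > word[j])
        total += weight * inv
    return total

def inversion_formula(n, p, k):
    """Closed form (1/2) * binom(n,2) * (1 - (p_1^2 + ... + p_a^2)^k)."""
    return Fraction(comb(n, 2), 2) * (1 - sum(x * x for x in p) ** k)
```

For an unbiased 2-shuffle of 4 cards, `inversion_formula(4, [Fraction(1, 2)] * 2, 1)` equals 3/2, half the average of 3 inversions for a uniform permutation of 4 cards.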

Remarks

1. Note that a uniformly chosen element of $S_n$ has on average $\frac{1}{2}\binom{n}{2}$ inversions. In fact the distribution for inversions on $S_n$ is the sum $X_1 + \cdots + X_n$ where the $X_i$ are independent and uniform on $\{0, 1, \cdots, i - 1\}$.

2. By Holder's inequality, the expected number of inversions is maximum for $k$ unbiased $a$-shuffles (which is the same as an $a^k$-shuffle), and in this case is $\frac{1}{2}\binom{n}{2}\left[1 - \frac{1}{a^k}\right]$. For instance, a 1-shuffle of a sorted deck gives no inversions, and a 2-shuffle of a sorted deck gives a permutation which has on average one half as many inversions as a random permutation.

3. It would be interesting to use Theorem 5 to study the asymptotics of inversions after a biased riffle shuffle. Even for the case $a = 2$, $p_1 = p_2 = \frac{1}{2}$, it is not known if the $n \to \infty$ limit distribution is normal.

4. The same technique used in Theorem 6 can be used to study the distribution of $Des(\pi)$, the number of descents of a permutation $\pi$ after a biased riffle shuffle. For example, using the convention that all elements of $S_n$ have a descent at position $n$, the expected number of descents would be

$$1 + \frac{n - 1}{2} \left[ 1 - (p_1^2 + \cdots + p_a^2)^k \right]$$

It is perhaps surprising that these moments can be computed so easily. One reason to be surprised is that in the case of unbiased shuffles, Bayer and Diaconis [1] showed that $Des(\pi^{-1})$ is a sufficient statistic for the random walk. Nevertheless, computing the moments of $Des(\pi^{-1})$ is more difficult than computing the moments of $Des(\pi)$, as a glance at the work of Mann [10] will make clear.